Abstract

Sequence representations supporting queries access, select and rank are at the core of many 
data structures. There is a considerable gap between different upper bounds, and the few lower 
bounds, known for such a representation, and how they interact with the space used. In this 
article we prove a strong lower bound for rank, which holds for rather permissive assumptions 
on the space, and give matching upper bounds that require only a compressed representation of
the sequence. Within this compressed space, operations access and select can be solved within
almost-constant time.

1 Introduction

A large number of data structures build on sequence representations. In particular, supporting the 
following three queries on a sequence $S[1,n]$ over alphabet $[1,\sigma]$ has proved extremely useful:

• $\mathrm{access}(S,i)$ gives $S[i]$;

• $\mathrm{select}_a(S,j)$ gives the position of the $j$th occurrence of $a \in [1,\sigma]$ in $S$; and

• $\mathrm{rank}_a(S,i)$ gives the number of occurrences of $a \in [1,\sigma]$ in $S[1,i]$.
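
As a reference point, the three queries can be written naively in a few lines. The Python sketch below is only a linear-time baseline that fixes the 1-based indexing conventions used in the text; the function names are ours and do not correspond to any of the cited structures.

```python
def access(S, i):
    """Return S[i], with 1-based indexing as in the text."""
    return S[i - 1]


def rank(S, a, i):
    """Number of occurrences of symbol a in S[1, i]."""
    return sum(1 for x in S[:i] if x == a)


def select(S, a, j):
    """1-based position of the j-th occurrence of a in S, or None if there is none."""
    seen = 0
    for pos, x in enumerate(S, start=1):
        if x == a:
            seen += 1
            if seen == j:
                return pos
    return None


S = [3, 1, 2, 1, 3, 1]
print(access(S, 5), rank(S, 1, 4), select(S, 1, 3))  # prints: 3 2 6
```

All the structures discussed in this article answer exactly these queries, only faster and within less space.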

For example, Ferragina and Manzini's FM-index [9], a compressed indexed representation for
text collections that supports pattern searches, is most successfully implemented over a sequence
representation supporting access and rank [10]. Grossi et al. [18] had earlier used similar techniques
for text indexing, and invented wavelet trees, a compressed sequence representation that solves the
three queries in time $O(\lg\sigma)$. The time was reduced to $O(\frac{\lg\sigma}{\lg\lg n})$ with multiary wavelet trees
[10, 17]. Golynski et al. [16] used these operations for representing labeled trees and permutations,
and proposed another representation that solved the operations in time $O(\lg\lg\sigma)$, and some even in
constant time. This representation was made compressed by Barbay et al. [1]. Further applications
of the three operations to multi-labeled trees and binary relations were uncovered by Barbay et
al. [2]. Ferragina et al. [8] and Gupta et al. [20] devised new applications to XML indexing. Barbay
et al. [1] gave applications to representing permutations and inverted indexes. Claude and
Navarro [7] presented applications to graph representation. Mäkinen and Välimäki [29] and Gagie
et al. [13] applied them to document retrieval on general texts.

The most basic case is that of bitmaps, when $\sigma = 2$. In this case obvious applications are set
representations supporting membership and predecessor search. We assume throughout this article the RAM
model with word size $w = \Omega(\lg n)$. Jacobson [21] achieved constant-time rank using $o(n)$ extra
bits on top of a plain representation of $S$, and Munro [23] and Clark [6] also achieved constant-time
select. Golynski [14] showed a lower bound of $\Omega(n \lg\lg n / \lg n)$ extra bits for supporting both
operations in constant time if $S$ is to be represented in plain form, and gave matching upper bounds.
When $S$ can be represented arbitrarily, Patrascu [25] achieved $\lg\binom{n}{m} + O(n/\lg^c n)$ bits of space,
where $m$ is the number of 1s in $S$ and $c$ is any constant, and showed this is optimal [28].

For general sequences, a useful measure of compressibility is the zero-order entropy of $S$,
$H_0(S) = \sum_{a \in [1,\sigma]} \frac{n_a}{n} \lg\frac{n}{n_a}$, where $n_a$ is the number of occurrences of $a$ in $S$. This can be extended
to the $k$-th order entropy, $H_k(S) = \frac{1}{n} \sum_{A \in [1,\sigma]^k} |T_A| \, H_0(T_A)$, where $T_A$ is the string of symbols
following $k$-tuple $A$ in $S$. It holds $0 \le H_k(S) \le H_{k-1}(S) \le H_0(S) \le \lg\sigma$ for any $k$, but the entropy
measure is only meaningful for $k < \lg_\sigma n$. See Manzini [22] and Gagie [12] for a deeper discussion.
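
To make the two entropy definitions concrete, the following Python sketch evaluates them directly from the formulas above. It is an illustration only: the names `h0` and `hk` are ours, and the contexts $T_A$ are collected by a plain scan rather than by any compressed structure.

```python
from collections import Counter, defaultdict
from math import log2


def h0(S):
    """Zero-order empirical entropy: sum over symbols of (n_a/n) * lg(n/n_a)."""
    n = len(S)
    if n == 0:
        return 0.0
    return sum((c / n) * log2(n / c) for c in Counter(S).values())


def hk(S, k):
    """k-th order empirical entropy: (1/n) * sum over contexts A of |T_A| * H0(T_A)."""
    if k == 0:
        return h0(S)
    n = len(S)
    following = defaultdict(list)  # T_A: symbols that follow each k-tuple A
    for i in range(k, n):
        following[tuple(S[i - k:i])].append(S[i])
    return sum(len(T) * h0(T) for T in following.values()) / n


print(h0("abab"), hk("abab", 1))  # prints: 1.0 0.0
```

The string "abab" needs one bit per symbol at order zero, yet each symbol is fully determined by its predecessor, so its first-order entropy is zero.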

When representing sequences supporting these operations, we may aim at using $O(n\lg\sigma)$ bits
of space, but frequently it is useful to achieve less space. We may aim at a succinct representation of
$S$, taking $n\lg\sigma + o(n\lg\sigma)$ bits, at a zero-order compressed representation, taking at most $nH_0(S) +
o(n\lg\sigma)$ bits (we might also wish to compress the redundancy, $o(n\lg\sigma)$, to achieve for example
$nH_0(S) + o(nH_0(S))$), or at a high-order compressed representation, $nH_k(S) + o(n\lg\sigma)$.

Upper and lower bounds for sequence representations supporting the three operations are far less
understood over arbitrary alphabets. When $\sigma = O(\mathrm{polylog}\, n)$, the three operations can be carried
out in constant time over a data structure using $nH_0(S) + o(n)$ bits [10]. For larger alphabets,
this solution requires the same space and answers the queries in time $O(\frac{\lg\sigma}{\lg\lg n})$. Another
class of solutions [16, 19, 1], especially attractive for "large alphabets", achieves time $O(\lg\lg\sigma)$ for
rank. For access and select they offer complementary complexities, where one of the operations
is constant-time and the other requires $O(\lg\lg\sigma)$ time. They achieve zero-order compressed space,
$nH_0(S) + o(nH_0(S)) + o(n)$ bits [1], and even high-order compressed space, $nH_k(S) + o(n\lg\sigma)$ for
any $k = o(\lg_\sigma n)$ [19].

There are several curious aspects in the map of the current solutions for general sequences. On 
one hand, the times for access and select seem to be complementary, whereas that for rank is 
always the same. On the other hand, there is no smooth transition between the complexity of one
solution, $O(\frac{\lg\sigma}{\lg\lg n})$, and that of the other, $O(\lg\lg\sigma)$.

The complementary nature of access and select is not a surprise. Golynski [15] gave lower
bounds that relate the time performance that can be achieved for these operations with the
redundancy of an encoding of $S$ on top of its information content. The lower bound acts on the product
of both times, that is, if $t$ and $t'$ are the time complexities, and $\rho$ is the bit-redundancy per symbol,
then $\rho \cdot t \cdot t' = \Omega((\lg\sigma)^2/w)$ holds for a wide range of values of $\sigma$. The upper bounds for large
alphabets [16, 19] match this lower bound.

Although operation rank seems to be harder than the others (at least no constant-time solution
exists except for polylog-sized alphabets), no general lower bounds on this operation have been
proved. Only a recent result for the case in which $S$ must be encoded in plain form states that the
time for rank must be $\Omega((\lg\sigma)/\rho)$, for $t = O(\frac{\lg\sigma}{\lg\lg\sigma})$ [19].

In this article we make several contributions that help close the gap between lower and upper 
bounds on sequence representation. 

1. We prove the first general lower bound on rank, which shows that this operation is, in
a sense, noticeably harder than the others: No structure using $O(n \cdot w^{O(1)})$ bits can answer
rank queries in time $o(\lg\frac{\lg\sigma}{\lg w})$. Note the space includes the rather permissive $O(n \cdot \mathrm{polylog}\, n)$.
For this range of times our general bound is much stronger than the existing restricted one
[19], which only forbids achieving it within $n\lg\sigma + O(n\lg\sigma/\lg\frac{\lg\sigma}{\lg w})$ bits. Our lower bound
uses a reduction from predecessor queries.

2. We give a matching upper bound for rank, using $O(n\lg\sigma)$ bits of space and answering queries
in time $O(\lg\frac{\lg\sigma}{\lg w})$. This is lower than any time complexity achieved so far for this operation
within $O(n \cdot w^{O(1)})$ bits, and it elegantly unifies both known upper bounds under a single and
lower time complexity. This is achieved via a reduction to a predecessor query structure that
is tuned to use slightly less space.

3. We derive succinct and compressed representations of sequences that achieve time $O(1 + \frac{\lg\sigma}{\lg w})$
for access, select and rank, improving upon previous results [10]. This yields constant-time
operations for $\sigma = w^{O(1)}$. Succinctness is achieved by replacing universal tables used in other
solutions with bit manipulations on the RAM model. Compression is achieved by combining
the succinct representation with existing compression boosters.

4. We derive succinct and compressed representations of sequences over larger alphabets, which
achieve time $O(\lg\frac{\lg\sigma}{\lg w})$ for rank, which is optimal, and almost-constant time for access and
select. The result improves upon almost all succinct and compressed representations proposed
so far [16, 1, 19]. This is achieved by plugging our $O(n\lg\sigma)$-bit solutions into existing
succinct and compressed data structures.

Our results assume a RAM model where bit shifts, bitwise logical operations, and arithmetic 
operations (including multiplication) are permitted. Otherwise we can simulate them with universal 
tables within $o(n)$ extra bits of space, but all our $\lg w$ in the upper bounds become $\lg\lg n$.

2 Lower Bound for rank 

Our technique is to reduce from a predecessor problem and apply the density-aware lower bounds
of Patrascu and Thorup [26]. Assume that we have $n$ keys from a universe of size $u = n\sigma$; then
the keys are of length $\ell = \lg u = \lg n + \lg\sigma$. According to branch 2 of Patrascu and Thorup's
result, the time for predecessor queries in this setting is lower bounded by $\Omega(\lg\frac{\ell - \lg n}{a})$, where
$a = \lg(s/n) + \lg w$ and $s$ is the space in words of our representation (the lower bound is in the
cell probe model for word length $w$, so the space is always expressed in number of cells). We will
assume that $\sigma = O(n)$; the other case will be considered at the end of the section.

The reduction is as follows. We divide the universe $n \cdot \sigma$ into $\sigma$ intervals, each of size $n$. This
division can be viewed as a binary matrix of $n$ columns by $\sigma$ rows, where we set a 1 at row $r$ and
column $c$ iff element $(r - 1) \cdot n + c$ belongs to the set. We will use three data structures.

1. A partial sums structure $R$ stores the number of elements in each row. It is a bitmap
concatenating the $\sigma$ unary representations, $1^{n_r}0$, of the number of 1s in each row $r \in [1,\sigma]$.
Thus $R$ is of length $n + \sigma$ and can give the number of 1s up to (and
including) any row $r$, $\mathrm{count}(r) = \mathrm{rank}_1(R, \mathrm{select}_0(R, r)) = \mathrm{select}_0(R, r) - r$, in constant time
and $O(n + \sigma) = O(n)$ bits of space [23, 6].

2. A column mapping data structure $C$ that maps the original columns into a set of columns
where (i) empty columns are eliminated, and (ii) new columns are created when two or more 1s
fall in the same column. $C$ is a bitmap concatenating the $n$ unary representations, $1^{n_c}0$, of the
numbers $n_c$ of 1s in each column $c \in [1,n]$. So $C$ is of length $2n$. Note that the new matrix of
mapped columns also has $n$ columns (one per element in the set) and exactly one 1 per column.
The original column $c$ is then mapped to $\mathrm{col}(c) = \mathrm{rank}_1(C, \mathrm{select}_0(C, c)) = \mathrm{select}_0(C, c) - c$,
using constant time and $O(n)$ bits. Note that $\mathrm{col}(c)$ is the last of the columns to which the
original column $c$ might have been expanded.

3. A string $S[1,n]$ over alphabet $[1,\sigma]$, so that $S[c] = r$ iff the only 1 at column $c$ (after column
remapping) is at row $r$. Over this string we build a data structure able to answer queries
$\mathrm{rank}_r(S, c)$.


Queries are done in the following way. Given an element $x \in [1, n\sigma]$, we first decompose it into
a pair $(r, c)$ where $x = (r - 1) \cdot n + c$. In a first step, we compute $\mathrm{count}(r - 1)$ in constant time.
This gives us the count of elements up to point $(r - 1) \cdot n$. Next we must compute the count of
elements in the range $[(r - 1) \cdot n + 1, (r - 1) \cdot n + c]$. For doing that we first remap the column to
$c' = \mathrm{col}(c)$ in constant time, and finally compute $\mathrm{rank}_r(S, c')$, which gives the number of 1s in row
$r$ up to column $c'$. Note that if column $c$ was expanded to several ones, we are counting the 1s up
to the last of the expanded columns, so that all the original 1s at column $c$ are counted at their
respective rows. Then the rank of the predecessor of $x$ is $\mathrm{count}(r - 1) + \mathrm{rank}_r(S, \mathrm{col}(c))$. We can
then associate any information to it in an array indexed by such rank.
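
The reduction can be traced on a small instance. The Python sketch below builds $R$, $C$ and $S$ from a set of keys and answers predecessor-rank queries with the formula above. It is a hedged illustration: `build_reduction`, `select0`, `rank` and `predecessor_rank` are names we introduce here, and the naive linear-time `select0` and `rank` merely stand in for the constant-time bitmap structures and for the rank structure under study.

```python
def build_reduction(keys, n, sigma):
    """Build bitmaps R, C and string S for n distinct keys in [1, n*sigma]."""
    per_row = [0] * (sigma + 1)
    per_col = [[] for _ in range(n + 1)]      # rows of the 1s found in each column
    for x in sorted(keys):
        r, c = (x - 1) // n + 1, (x - 1) % n + 1
        per_row[r] += 1
        per_col[c].append(r)
    R = []
    for r in range(1, sigma + 1):
        R += [1] * per_row[r] + [0]           # unary code 1^{n_r} 0
    C, S = [], []
    for c in range(1, n + 1):
        C += [1] * len(per_col[c]) + [0]      # unary code 1^{n_c} 0
        S += per_col[c]                       # one remapped column per 1
    return R, C, S


def select0(B, j):
    """1-based position of the j-th 0 in B; 0 when j == 0."""
    if j == 0:
        return 0
    seen = 0
    for pos, bit in enumerate(B, start=1):
        if bit == 0:
            seen += 1
            if seen == j:
                return pos
    raise ValueError("fewer than j zeros")


def rank(S, a, i):
    """Occurrences of a in S[1, i]; placeholder for the rank structure."""
    return S[:i].count(a)


def predecessor_rank(x, n, R, C, S):
    """Number of keys <= x, computed as count(r-1) + rank_r(S, col(c))."""
    r, c = (x - 1) // n + 1, (x - 1) % n + 1
    count = select0(R, r - 1) - (r - 1)
    col = select0(C, c) - c
    return count + rank(S, r, col)


R, C, S = build_reduction({2, 5, 6, 11}, n=4, sigma=3)
print(S, predecessor_rank(6, 4, R, C, S))  # prints: [2, 1, 2, 3] 3
```

Only the call to `rank` depends on the sequence representation, which is why a fast rank structure would immediately yield a fast predecessor structure.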

Theorem 1 Given a data structure that supports rank queries on strings of length $n$ over alphabet
$[1,\sigma]$ in time $t(n, \sigma)$ and using $s(n, \sigma)$ bits of space, we can solve a predecessor search for $n$ integers
from universe $[1, n\sigma]$ in time $t(n, \sigma) + O(1)$ using a data structure that occupies $s(n, \sigma) + O(n)$ bits
of space.

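By the reduction above we get that any lower bound for predecessor search for $n$ keys over
a universe of size $n\sigma$ must also apply to rank queries on sequences of length $n$ over an alphabet of size $\sigma$.
In our case, if we aim at using $O(n \cdot w^{O(1)})$ bits of space, we have $\lg(s/n) + \lg w = O(\lg w)$ and
$\ell - \lg n = \lg\sigma$, so this lower bound (branch 2 [26]) is

$$\Omega\left(\lg\frac{\ell - \lg n}{\lg(s/n) + \lg w}\right) = \Omega\left(\lg\frac{\lg\sigma}{\lg w}\right).$$

For $\sigma = \Theta(n)$ and $w = \Theta(\lg n)$, the bound is simply $\Omega(\lg\lg\sigma)$. In case $\sigma = \omega(n)$, our lower
bound cannot be better than this $\Omega(\lg\frac{\lg\sigma}{\lg w})$, as otherwise we could enlarge $\sigma$ artificially by just
declaring it much larger than its real limit, and have better lower bounds.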

Theorem 2 Any data structure that uses space $O(n \cdot w^{O(1)})$ bits to represent a sequence of length
$n$ over alphabet $[1,\sigma]$ must use time $\Omega(\lg\frac{\lg\sigma}{\lg w})$ to answer rank queries.

Assume to simplify that $w = \Theta(\lg n)$. This lower bound is trivial for small $\lg\sigma = O(\lg\lg n)$ (i.e.,
$\sigma = O(\mathrm{polylog}\, n)$), where constant-time solutions for rank exist that require only $nH_0(S) + o(n)$
bits [10]. On the other hand, if $\sigma$ is sufficiently large, $\lg\sigma = \Omega((\lg\lg n)^{1+\epsilon})$ for any constant $\epsilon > 0$,
the lower bound becomes simply $\Omega(\lg\lg\sigma)$, where it is matched by known compact and compressed
solutions [16, 1, 19] requiring as little as $nH_0(S) + o(nH_0(S)) + o(n)$ or $nH_k(S) + o(n\lg\sigma)$ bits.

The only range where this lower bound has not yet been matched is $\omega(\lg\lg n) = \lg\sigma =
o((\lg\lg n)^{1+\epsilon})$, for any constant $\epsilon > 0$. The next section presents a new matching upper bound.



3 Optimal Upper Bound for rank

We now show a matching upper bound with optimal time and space $O(n\lg\sigma)$ bits. In the next
sections we make the space succinct and even compressed.

We reduce the problem to predecessor search and then use a convenient solution for that problem.
The idea is simply to represent the string $S[1,n]$ over alphabet $[1,\sigma]$ as a matrix of $\sigma$ rows
and $n$ columns, and regard the matrix as the set of $n$ points $\{(S[c] - 1) \cdot n + c,\ c \in [1, n]\}$ over the
universe $[1, n\sigma]$. Then we store an array of $n$ cells containing $\langle r, \mathrm{rank}_r(S,c) \rangle$, where $r = S[c]$, for
the point corresponding to column $c$ in the set.

To query $\mathrm{rank}_r(S, c)$ we compute the predecessor of $(r - 1) \cdot n + c$. If it is a pair $\langle r, v \rangle$, for some
$v$, then the answer is $v$. Else the answer is zero.
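
A minimal Python sketch of this reduction follows. It is illustrative only: the helper names are ours, and a sorted array searched with the standard `bisect` module plays the role of the predecessor structure, so the query time here is logarithmic rather than the bound targeted in the text.

```python
from bisect import bisect_right


def build_rank_index(S):
    """Map column c with r = S[c] to the point (r-1)*n + c, storing the pair (r, rank_r(S, c))."""
    n = len(S)
    seen = {}
    entries = []
    for c, r in enumerate(S, start=1):
        seen[r] = seen.get(r, 0) + 1
        entries.append(((r - 1) * n + c, r, seen[r]))
    entries.sort()
    keys = [e[0] for e in entries]
    return n, keys, entries


def rank_via_predecessor(index, r, c):
    """rank_r(S, c): read the pair stored at the predecessor of (r-1)*n + c."""
    n, keys, entries = index
    p = bisect_right(keys, (r - 1) * n + c) - 1   # predecessor, the key itself included
    if p < 0:
        return 0
    _, row, v = entries[p]
    return v if row == r else 0


index = build_rank_index([3, 1, 2, 1, 3, 1])
print(rank_via_predecessor(index, 1, 4))  # prints: 2
```

When the predecessor lies in a different row, no occurrence of $r$ precedes column $c$, which is why the answer is zero in that case.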

This solution requires $n\lg\sigma + n\lg n$ bits for the pairs, on top of the space of the predecessor
query. If $\sigma < n$ we can reduce this extra space to $2n\lg\sigma$ by storing the pairs $\langle r, \mathrm{rank}_r(S,c) \rangle$
in a different way. We virtually cut the string into chunks of length $\sigma$, and store the pair as
$\langle r, \mathrm{rank}_r(S, c) - \mathrm{rank}_r(S, c - (c \bmod \sigma)) \rangle$. The rest of the $\mathrm{rank}_r$ information is obtained in constant
time and $O(n)$ bits using Golynski et al.'s [16] reduction to chunks: They store a bitmap $A[1,2n]$
where the matrix is traversed row-wise and we append to $A$ a 1 for each 1 found in the matrix and
a 0 each time we move to the next chunk (so we append $n/\sigma$ 0s per row). Then the remaining
information for $\mathrm{rank}_r(S, c)$ is $\mathrm{rank}_r(S, c - (c \bmod \sigma)) = \mathrm{select}_0(A, p_1) - \mathrm{select}_0(A, p_0) - (c \ \mathrm{div}\ \sigma)$,
where $p_0 = (r - 1) \cdot n/\sigma$ and $p_1 = p_0 + (c \ \mathrm{div}\ \sigma)$ (we have simplified the formulas by assuming $\sigma$
divides $n$).
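
The chunk-boundary formula can be checked with a short Python sketch. The code assumes, as the text does, that $\sigma$ divides $n$; `build_A`, `select0` and `rank_at_chunk_boundary` are illustrative names of ours, and the linear-time `select0` stands in for a constant-time select structure.

```python
def build_A(S, sigma):
    """Bitmap A: for each row r and each chunk, 1^(occurrences of r in the chunk) followed by 0."""
    n = len(S)
    assert n % sigma == 0, "sketch assumes sigma divides n"
    A = []
    for r in range(1, sigma + 1):
        for start in range(0, n, sigma):
            A += [1] * S[start:start + sigma].count(r) + [0]
    return A


def select0(A, j):
    """1-based position of the j-th 0 in A; 0 when j == 0."""
    if j == 0:
        return 0
    seen = 0
    for pos, bit in enumerate(A, start=1):
        if bit == 0:
            seen += 1
            if seen == j:
                return pos
    raise ValueError("fewer than j zeros")


def rank_at_chunk_boundary(A, n, sigma, r, c):
    """rank_r(S, c - (c mod sigma)) = select0(A, p1) - select0(A, p0) - (c div sigma)."""
    p0 = (r - 1) * (n // sigma)
    p1 = p0 + c // sigma
    return select0(A, p1) - select0(A, p0) - c // sigma


S = [1, 2, 1, 2, 2, 2, 1, 1]
A = build_A(S, sigma=2)
print(len(A), rank_at_chunk_boundary(A, len(S), 2, 2, 7))  # prints: 16 4
```

The difference of the two select positions counts both the 1s and the 0s between them, so subtracting the number of chunk separators leaves exactly the occurrences of $r$ in the complete chunks.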

Theorem 3 Given a solution for predecessor search on a set of $n$ keys chosen from a universe
of size $u$, that occupies space $s(n,u)$ and answers in time $t(n,u)$, there exists a solution for rank
queries on a sequence of length $n$ over an alphabet $[1,\sigma]$ that runs in time $t(n, n\sigma) + O(1)$ and
occupies $s(n, n\sigma) + O(n\lg\sigma)$ bits of space.

In the extended version of their article, Patrascu and Thorup [27] give an upper bound matching
the lower bound of branch 2 and using $O(n\lg u)$ bits for $n$ elements over a universe $[1,u]$. In the
Appendix we show that the same time can be achieved with space $O(n\lg(u/n))$, which is not
surprising (they have given hints, actually) but we opt for completeness. By using this predecessor
data structure, the following result is immediate.

Theorem 4 A string $S[1,n]$ over alphabet $[1,\sigma]$ can be represented using $O(n\lg\sigma)$ bits, so that
operation rank is solved in time $O(\lg\frac{\lg\sigma}{\lg w})$.

Note that, within this space, operations access and select can also be solved in constant time. 

4 Optimal-time rank in Succinct and Compressed Space

We start with a sequence representation using $n\lg\sigma + o(n\lg\sigma)$ bits (i.e., succinct) that answers
access and select queries in almost-constant time, and rank in time $O(\lg\frac{\lg\sigma}{\lg w})$. This is done in two
phases: a constant-time solution for $\sigma = w^{O(1)}$, and then a solution for general alphabets. Then
we turn to compressed representations.

4.1 Succinct Representation for Small Alphabets

Using multiary wavelet trees [10] we can obtain succinct space and $O(\frac{\lg\sigma}{\lg\lg n})$ time for access, select
and rank. This is constant for $\lg\sigma = O(\lg\lg n)$. We start by extending this result to the case
$\lg\sigma = O(\lg w)$, as a base case for handling larger alphabets thereafter. More precisely, we prove
the following result.

Theorem 5 A string $S[1,n]$ over alphabet $[1,\sigma]$ can be represented using $n\lg\sigma + o(n\lg\sigma)$ bits so
that operations access, select and rank can be solved in time $O(1 + \frac{\lg\sigma}{\lg w})$. If $\sigma = w^{O(1)}$, the space
is $n\lceil\lg\sigma\rceil + o(n)$ bits and the operation times are $O(1)$.

A multiary wavelet tree for $S[1,n]$ divides, at the root node $v$, the alphabet $[1,\sigma]$ into $r$
contiguous regions of the same size. A sequence $R_v[1,n]$ recording the region each symbol belongs to is
stored at the root node (note $R_v$ is a sequence over alphabet $[1,r]$). This node has $r$ children, each
handling the subsequence of $S$ formed by the symbols belonging to a given region. The children
are decomposed recursively, thus the wavelet tree has height $O(\lg_r \sigma)$. Queries access, select and
rank on sequence $S[1,n]$ are carried out via $O(\lg_r \sigma)$ similar queries on the sequences $R_v$ stored at
wavelet tree nodes [18]. By choosing $r$ such that $\lg r = \Theta(\lg\lg n)$, it turns out that the operations
on the sequences $R_v$ can be carried out in constant time, and thus the cost of the operations on
the original sequence $S$ is $O(\frac{\lg\sigma}{\lg\lg n})$ [10].

In order to achieve time $O(\frac{\lg\sigma}{\lg w})$, we need to handle in constant time the operations over
alphabets of size $r = w^{\beta}$, for some $0 < \beta < 1$, so that $\lg r = \Theta(\lg w)$. This time we cannot resort to
universal tables of size $o(n)$, but rather must use bit manipulation on the RAM model.

The sequence $R_v[1,n]$ is stored as the concatenation of $n$ fields of length $\lg r$, into consecutive
machine words. Thus achieving constant-time access is trivial: To access $R_v[i]$ we simply extract
the corresponding bits, from the $(1 + (i - 1) \cdot \lg r)$-th to the $(i \cdot \lg r)$-th, from one or two consecutive
machine words, using bit shifts and masking.
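
The field extraction can be sketched in Python as follows. The functions `pack_fields` and `access_field` are hypothetical helpers of ours; since Python integers are unbounded, each machine word is simulated by explicitly masking to `w` bits, and `ell` plays the role of $\lg r$ (assumed to be at most `w`).

```python
def pack_fields(values, ell, w=64):
    """Concatenate ell-bit fields into a list of w-bit words, lowest bits first."""
    words = [0] * ((len(values) * ell + w - 1) // w)
    word_mask = (1 << w) - 1
    for i, v in enumerate(values):
        q, off = divmod(i * ell, w)
        words[q] |= (v << off) & word_mask
        if off + ell > w:                 # the field straddles two words
            words[q + 1] |= v >> (w - off)
    return words


def access_field(words, i, ell, w=64):
    """Return the i-th field (1-based) using shifts and masks on at most two words."""
    q, off = divmod((i - 1) * ell, w)
    x = words[q] >> off
    if off + ell > w:                     # fetch the remaining bits from the next word
        x |= words[q + 1] << (w - off)
    return x & ((1 << ell) - 1)


values = [5, 0, 7, 3, 6, 1, 2, 4, 7]
words = pack_fields(values, ell=3, w=8)   # tiny words to force straddling fields
print([access_field(words, i, ell=3, w=8) for i in range(1, 10)] == values)  # prints: True
```

The tiny word size in the example is chosen only so that some fields cross a word boundary; with realistic `w` the code path is the same.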

Operations rank and select are more complex. We will proceed by cutting the sequence $R_v$ into
blocks of length $b = w^{\alpha}$ symbols, for some $\beta < \alpha < 1$. First we show how, given a block number $i$
and a symbol $a$, we extract from $R[1, b] = R_v[(i - 1) \cdot b + 1, i \cdot b]$ a bitmap $B[1, b]$ such that $B[j] = 1$
iff $R[j] = a$. Then we use this result to achieve constant-time rank queries. Next, we show how to
solve predecessor queries in constant time, for several fields of length $\lg w$ bits fitting in a machine
word. Finally, we use this result to obtain constant-time select queries.

4.1.1 Projecting a Block 

Given sequence $R[1, b] = R_v[1 + (i - 1) \cdot b, i \cdot b]$, which is of length $b \cdot \ell = w^{\alpha} \lg r < w^{\alpha} \lg w = o(w)$
bits, where $\ell = \lg r$, and given $a \in [1, r]$, we extract $B[1, b \cdot \ell]$ such that $B[j \cdot \ell] = 1$ iff $R[j] = a$.

To do so, we first compute $X = a \cdot (0^{\ell-1}1)^b$. This creates $b$ copies of $a$ within $\ell$-bit long fields.
Second, we compute $Y = R$ XOR $X$, which will have zeroed fields at the positions $j$ where $R[j] = a$.
To identify those fields, we compute $Z = (10^{\ell-1})^b - Y$, which will have a 1 at the highest bit of the
zeroed fields in $Y$. Now $W = Z$ AND $(10^{\ell-1})^b$ isolates those leading bits.
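
The four word operations can be sketched in Python, with an unbounded integer standing in for the machine word. One assumption is ours and should be stressed: every $\ell$-bit field is taken to keep its top bit at zero (stored values are below $2^{\ell-1}$), so that the subtraction cannot borrow across fields. The name `project_block` is illustrative.

```python
def project_block(R, a, b, ell):
    """Mark, at the top bit of each ell-bit field, the fields of R that equal a.

    Assumes (our assumption) that all stored values and a are below 2**(ell-1).
    """
    ones = sum(1 << (j * ell) for j in range(b))   # (0^{ell-1} 1)^b
    high = ones << (ell - 1)                       # (1 0^{ell-1})^b
    X = a * ones                                   # b copies of a
    Y = R ^ X                                      # fields equal to a become zero
    Z = high - Y                                   # top bit survives only in zeroed fields
    return Z & high                                # keep just those top bits


b, ell = 4, 3
fields = [3, 1, 3, 2]                              # field j occupies bits j*ell .. j*ell+ell-1
R = sum(v << (j * ell) for j, v in enumerate(fields))
W = project_block(R, 3, b, ell)
print([(W >> (j * ell + ell - 1)) & 1 for j in range(b)])  # prints: [1, 0, 1, 0]
```

The cost is a constant number of arithmetic and logical operations, independent of $b$, provided the constants $(0^{\ell-1}1)^b$ and $(10^{\ell-1})^b$ are precomputed.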

4.1.2 Constant-time rank Queries 

We now describe how we can do rank queries in constant time for $R_v[1,n]$. Our solution follows
that of Jacobson [21]. We choose a superblock size $s = w^2$ and a block size $b = (\sqrt{w} - 1)/\lg r$.
For each $a \in [1,r]$, we store the accumulated values per superblock, $\mathrm{rank}_a(R_v, i \cdot s)$ for all $1 \le
i \le n/s$. We also store the within-superblock accumulated values per block, $\mathrm{rank}_a(R_v, i \cdot b) -
\mathrm{rank}_a(R_v, \lfloor (i \cdot b)/s \rfloor \cdot s)$, for $1 \le i \le n/b$. Both arrays of counters require, over all symbols,
$r\,((n/s) \cdot w + (n/b) \cdot \lg s) = O(n w^{\beta} (\lg w)^2 / \sqrt{w})$ bits. Added over the $O(\frac{\lg\sigma}{\lg w})$ wavelet tree levels, the
space required is $O(n \lg\sigma \lg w / w^{1/2-\beta})$ bits. This is $o(n\lg\sigma)$ for any $\beta < 1/2$, and furthermore it
is $o(n)$ if $\sigma = w^{O(1)}$.

To solve a query $\mathrm{rank}_a(R_v, i)$, we need to add up three values: (i) the superblock accumulator
at position $\lfloor i/s \rfloor$; (ii) the block accumulator at position $\lfloor i/b \rfloor$; (iii) the bits set at $B[1, (i \bmod b) \cdot \ell]$,
where $B$ corresponds to the values equal to $a$ in $R_v[\lfloor i/b \rfloor \cdot b + 1, \lfloor i/b \rfloor \cdot b + b]$. We have shown in
Section 4.1.1 how to extract $B[1, b \cdot \ell]$, so we count the number of bits set in $C = B$ AND $1^{(i \bmod b) \cdot \ell}$.

This counting is known as a popcount operation. Given a bit block of length $b\ell = \sqrt{w} - 1$, with
bits set at positions multiple of $\ell$, we popcount it using the following steps:

1. We first duplicate the block $b$ times into $b$ fields. That is, we compute $X = C \cdot (0^{b\ell-1}1)^b$.

2. We now isolate a different bit in each different field. This is done with $Y = X$ AND $(0^{b\ell}10^{\ell-1})^b$.
This will isolate the $i$th aligned bit in field $i$.

3. We now sum up all those isolated bits using the multiplication $Z = Y \cdot (0^{b\ell+\ell-1}1)^b$. The end
result of the popcount operation lies at the bits $Z[b^2\ell + 1, b^2\ell + \lg b]$.

4. We finally extract the result as $c = (Z \gg b^2\ell)$ AND $(1^{\lg b})$.
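
The four steps above can be sketched in Python, again with an unbounded integer simulating the machine word. The bit-indexing convention is an assumption of ours: bits are numbered from 0 at the least significant end and the marked bit of field $j$ sits at position $j\ell + \ell - 1$, so the final shift is computed explicitly and may differ by one from the 1-based positions quoted in the text. The function name is illustrative.

```python
def popcount_by_multiplication(C, b, ell):
    """Count the set bits of C, which may be set only at the top bit of each ell-bit field."""
    L = b * ell                                                  # block length in bits
    X = C * sum(1 << (i * L) for i in range(b))                  # step 1: b copies of C
    mask = sum(1 << (i * (L + ell) + ell - 1) for i in range(b))
    Y = X & mask                                                 # step 2: field i of copy i
    Z = Y * sum(1 << (k * (L + ell)) for k in range(b))          # step 3: add the isolated bits
    shift = (b - 1) * (L + ell) + ell - 1                        # where the full sum accumulates
    return (Z >> shift) & ((1 << b.bit_length()) - 1)            # step 4: extract the counter


b, ell = 4, 3
C = (1 << 2) | (1 << 8)            # fields 0 and 2 are marked
print(popcount_by_multiplication(C, b, ell))  # prints: 2
```

The isolated bits are spaced $b\ell + \ell$ positions apart, which leaves enough room for the partial sums created by the multiplication not to carry into one another.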



4.1.3 Constant-time select Queries 

We now describe how we can do select queries in constant time for $R_v[1,n]$. Our solution follows
that of Clark [6]. For each $a \in [1,r]$, consider the virtual bitmap $B_a[1,n]$ so that $B_a[j] = 1$ iff
Rv[j] = a. We choose a superblock size s = w'^ and a block size b = w^^'^ /{2\gr). Superblocks 
contain s 1-bits and are of variable length. They are called dense if their length is at most w^, and 
sparse otherwise. We store all the positions of the Is in sparse superblocks, which requires 0{n/w) 
bits of space as there are at most n/w^ sparse superblocks. For dense superblocks we only store 
their starting position in Ba and a pointer to a memory area. Both pointers require 0{n/w) bits 
as well. 

In the memory area of dense superblocks, we divide them into blocks of b Is. Blocks are called 
dense if their length is at most w'^^^ , and sparse otherwise. We store all the positions of the Is 
in sparse blocks. Since each position requires only \g{w^) as it is within a dense superblock, and 
there are at most n/nP'^^ sparse blocks, the total space for sparse blocks is 0{{n/vLp'^^)b\gw) = 
0{n{\gwy' /w^/^) bits. For dense blocks we store only their starting position within their dense 
superblock, which requires 0{{n/b)\gw) = 0{n{\gwf /w''/^) bits. 

The space, added over the r symbols, is 0{rn{\gw)'^ /w^^^) = 0{n{\gw)'^ /w^/^~^). Summing 
for O(j^) wavelet tree levels, the total space is 0{n\ga\gw /w^/^^^) bits. This is o{n\ga) for any 

/3 < 1/3, and furthermore it is o{n) if cr = w^^^\ 

In order to compute a $\mathrm{select}_a(R_v, j)$ query, we use the data structures for virtual bitmap $B_a[1, n]$.
If $\lfloor j/s \rfloor$ is a sparse superblock, then the answer is readily stored. If it is a dense superblock, we
only know its starting position and the offset $o = j - (j \bmod s)$ of the query within its superblock.
Now, if $\lfloor o/b \rfloor$ is a sparse block in its superblock, then the answer (which must be added to the
starting position of the superblock) is readily stored. If it is a dense block, we only know its starting
position in $B_a$, but now we only have to complete the search within the short area that the dense block spans in $B_a$.
In Section 4.1.1 we showed how to extract a chunk $B[1, b \cdot \ell]$ from $R_v$, so that $B[i \cdot \ell] = B_a[i]$. Now
we detail how we complete a select query within a chunk of length $b \cdot \ell$ for the remaining
$j' = j - (j \bmod b)$ bits.

This is based on running about $b$ popcount operations in parallel, one for each prefix of $B$. We
proceed as follows:

1. Duplicate $B$ into $b$ superfields with $X = B \cdot (0^{k-1}1)^b$, where $k = 2b^2\ell$ is the superfield size.

2. Compute $Y = X$ AND $(0^{k-b\ell}1^{b\ell}) \ldots (0^{k-2\ell}1^{2\ell})(0^{k-\ell}1^{\ell})$. This operation will keep only the first
$i$ aligned bits in superfield $i$.

3. Do popcount in parallel on all superfields using the algorithm described in Section 4.1.2. Note
that each superfield will have capacity $k = 2b^2\ell$, but only the first $b\ell$ bits in it are set, and
the alignment is $\ell$. Thus the popcount operation will have enough available space in each
superfield to operate.

4. Let $Z$ contain all the partial counts for all the prefixes of $B$. We need the position in $Z$ of
the first count equal to $j'$. We use the same projecting method described in Section 4.1.1 to
spot the superfields equal to $j'$ (the only difference is that superfields are much wider than
$\lg w$, namely of width $\ell = k$, but still all fits in a machine word). This method returns a word
$W[1, 2b^3\ell]$ such that $W[k \cdot i] = 1$ iff the $i$th superfield of $Z$ is equal to $j'$.

5. Isolate the least significant bit of $W$ with $V = W$ AND $(W$ XOR $(W - 1))$.

6. The final answer to $\mathrm{select}_1(B, j')$ is the position of the only 1 in $V$, divided by $k$. This is
easily computed by using monotone minimum perfect hash functions (mmphf) over the set
$\{2^{ki},\ 1 \le i \le b\}$. Existing data structures [1] take constant time and $O(b\lg w) = O(w)$ bits.
Such a data structure is universal and requires the same space as systemwide pointers.
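
Steps 5 and 6 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: Python's built-in `int.bit_length` stands in for the mmphf lookup that maps the isolated bit to its position, and the marked bit of the $i$th superfield (1-based) is assumed to sit at 1-based position $k \cdot i$. Both function names are ours.

```python
def lowest_set_bit(W):
    """Keep only the least significant 1 of W (for W > 0): V = W AND (W XOR (W - 1))."""
    return W & (W ^ (W - 1))


def first_marked_superfield(W, k):
    """1-based index i of the lowest marked superfield, where the mark is at 1-based bit k*i."""
    V = lowest_set_bit(W)
    return V.bit_length() // k        # bit_length gives the 1-based position of the single 1


k = 4
W = (1 << (k * 2 - 1)) | (1 << (k * 3 - 1))   # superfields 2 and 3 are marked
print(bin(lowest_set_bit(0b101000)), first_marked_superfield(W, k))  # prints: 0b1000 2
```

On a real machine the position lookup would be done with the constant-time structure cited in step 6, or with a hardware instruction where one is available.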

4.2 Succinct Representation for Larger Alphabets 

We assume now $\lg\sigma = \omega(\lg w)$; otherwise the previous section achieves succinctness and constant
time for all operations.

We build on Golynski et al.'s solution [16]. They first cut $S$ into chunks of length $\sigma$. With
bitvector $A[1,2n]$ described in Section 3 they reduce all the queries, in constant time, to within
a chunk. For each chunk they store a bitmap $X[1,2\sigma]$ where the number of occurrences of each
symbol $a \in [1,\sigma]$ in the chunk, $n_a$, is concatenated in unary, $X = 1^{n_1}0\,1^{n_2}0 \ldots 1^{n_\sigma}0$. Now they
introduce two complementary solutions.

Constant-time Select. The first one stores, for each consecutive symbol $a \in [1,\sigma]$, the chunk
positions where it appears, in increasing order. Let $\pi$ be the resulting permutation, which is
stored with the representation of Munro et al. [24]. This requires $\sigma\lg\sigma\,(1 + 1/f(n,\sigma))$ bits and
computes any $\pi(i)$ in constant time and any $\pi^{-1}(j)$ in time $O(f(n,\sigma))$, for any $f(n,\sigma) \ge 1$. With
this representation they solve, within the chunk, $\mathrm{select}_a(i) = \pi(\mathrm{select}_0(X, a - 1) - (a - 1) + i)$ in
constant time and $\mathrm{access}(i) = 1 + \mathrm{rank}_0(X, \mathrm{select}_1(X, \pi^{-1}(i)))$ in time $O(f(n,\sigma))$.
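
The two formulas can be traced on a small chunk with the Python sketch below. It is illustrative only: $\pi$ and $\pi^{-1}$ are stored as plain arrays instead of the representation of Munro et al., the bitmap operations are naive linear scans, and all function names are ours.

```python
def build_chunk(chunk, sigma):
    """Return the unary bitmap X, the permutation pi and its inverse for one chunk."""
    occ = [[] for _ in range(sigma + 1)]
    for pos, a in enumerate(chunk, start=1):
        occ[a].append(pos)
    X, pi = [], []
    for a in range(1, sigma + 1):
        X += [1] * len(occ[a]) + [0]          # 1^{n_a} 0
        pi += occ[a]                          # positions of a, in increasing order
    inv = [0] * len(pi)
    for idx, pos in enumerate(pi, start=1):
        inv[pos - 1] = idx                    # pi^{-1}(pos) = idx
    return X, pi, inv


def select_bit(B, bit, j):
    """1-based position of the j-th occurrence of bit in B; 0 when j == 0."""
    if j == 0:
        return 0
    seen = 0
    for pos, x in enumerate(B, start=1):
        if x == bit:
            seen += 1
            if seen == j:
                return pos
    raise ValueError("fewer than j occurrences")


def chunk_select(X, pi, a, i):
    """select_a(i) = pi(select_0(X, a-1) - (a-1) + i)."""
    return pi[select_bit(X, 0, a - 1) - (a - 1) + i - 1]


def chunk_access(X, inv, i):
    """access(i) = 1 + rank_0(X, select_1(X, pi^{-1}(i)))."""
    return 1 + X[:select_bit(X, 1, inv[i - 1])].count(0)


chunk = [2, 1, 3, 1]
X, pi, inv = build_chunk(chunk, sigma=3)
print(pi, chunk_select(X, pi, 1, 2), chunk_access(X, inv, 3))  # prints: [2, 4, 1, 3] 4 3
```

The asymmetry between the two operations comes entirely from which of $\pi$ and $\pi^{-1}$ is available in constant time.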

For $\mathrm{rank}_a(i)$, they basically carry out a predecessor search within the interval of $\pi$ that
corresponds to $a$: $[\mathrm{select}_0(X, a - 1) - (a - 1) + 1, \mathrm{select}_0(X, a) - a]$. They have a sampled predecessor
structure with one value out of $\lg\sigma$, which takes just $O(\sigma)$ bits. With this structure they reduce
the interval to size $\lg\sigma$, and a binary search completes the process, within overall time $O(\lg\lg\sigma)$.

To achieve optimal time, we sample one value out of $\frac{\lg\sigma}{\lg w}$. We build the predecessor data
structures of Patrascu and Thorup [27] mentioned in Section 3. Over all the symbols of the chunk,
these structures take $O((n/\frac{\lg\sigma}{\lg w})\lg\sigma) = O(n\lg w) = o(n\lg\sigma)$ bits (as we assumed $\lg\sigma = \omega(\lg w)$).
The predecessor structures take time $O(\lg\frac{\lg\sigma}{\lg w})$ (see Theorem 10 in the Appendix). The final binary
search also takes time $O(\lg\frac{\lg\sigma}{\lg w})$.

Constant-time Access. This time we use the structure of Munro et al. on $\pi^{-1}$, so we compute
any $\pi^{-1}(j)$ in constant time and any $\pi(i)$ in time $O(f(n,\sigma))$. Thus we get access in constant time
and select in time $O(f(n,\sigma))$.

Now the binary search of rank needs to compute values of $\pi$, which is not anymore constant
time. This is why Golynski et al. [16] obtained time slightly over $\lg\lg\sigma$ for rank in this case. We
instead set the sampling step to $(\frac{\lg\sigma}{\lg w})^{1/f(n,\sigma)}$. The predecessor structures on the sampled values
still answer in time $O(\lg\frac{\lg\sigma}{\lg w})$, but they take $O((n/(\frac{\lg\sigma}{\lg w})^{1/f(n,\sigma)})\lg\sigma)$ bits of space. This is $o(n\lg\sigma)$
provided $f(n,\sigma) = o(\lg\frac{\lg\sigma}{\lg w})$. On the other hand, the time for the binary search is
$O(\frac{f(n,\sigma)}{f(n,\sigma)}\lg\frac{\lg\sigma}{\lg w})$, as desired.

The following theorem, which improves upon Golynski et al.'s [16] (not only as a consequence
of a higher low-order space term), summarizes our result.

Theorem 6 A string $S[1,n]$ over alphabet $[1,\sigma]$, $\sigma < n$, can be represented using $n\lg\sigma + o(n\lg\sigma)$
bits, so that, given any function $\omega(1) = f(n,\sigma) = o(\lg\frac{\lg\sigma}{\lg w})$, (i) operations access and select can be
solved in time $O(1)$ and $O(f(n,\sigma))$, or vice versa, and (ii) rank can be solved in time $O(\lg\frac{\lg\sigma}{\lg w})$.

For larger alphabets we must add a dictionary mapping $[1,\sigma]$ to the (at most) $n$ symbols actually
occurring in $S$, in the standard way.

4.3 Zero-order Compression 

Barbay et al. [1] showed how, given a representation $\mathcal{R}$ of a sequence in $n\lg\sigma + o(n\lg\sigma)$ bits, its times
for access, select and rank can be maintained while reducing its space to $nH_0(S) + o(nH_0(S)) + o(n)$
bits. This can be done even if $\mathcal{R}$ works only for $\sigma \ge (\lg n)^c$ for some constant $c$.

The technique separates the symbols according to their frequencies into $O(\lg n)$ classes. The
sequence of classes is represented using a multiary wavelet tree [10], and the subsequences of the
symbols of each class are represented with an instance of $\mathcal{R}$.

We can use this technique to compress the space of our succinct representations. By using
Theorem 5 as our structure $\mathcal{R}$, we obtain the following result, which improves upon Ferragina et
al. [10].

Theorem 7 A string $S[1,n]$ over alphabet $[1,\sigma]$ can be represented using $nH_0(S) + o(nH_0(S)) + o(n)$
bits so that operations access, select and rank can be solved in time $O(1 + \frac{\lg\sigma}{\lg w})$. If $\sigma = w^{O(1)}$, the
space is $nH_0(S) + o(n)$ and the operation times are $O(1)$.

To handle larger alphabets, we use Theorem 6 as our structure $\mathcal{R}$. The only technical problem
is that the subsequences range over a smaller alphabet $[1,\sigma']$, and Theorem 6 holds only for $\lg\sigma' =
\omega(\lg w)$. In subsequences with smaller alphabets we can use Theorem 5, which gives access, select
and rank times $O(1 + \frac{\lg\sigma'}{\lg w})$. More precisely, we use that structure for $\frac{\lg\sigma'}{\lg w} \le f(n,\sigma)$, else use Theorem 6.

This gives the following result, which improves upon Barbay et al.'s [1].

Theorem 8 A string $S[1,n]$ over alphabet $[1,\sigma]$, $\sigma < n$, can be represented using $nH_0(S) +
o(nH_0(S)) + o(n)$ bits, so that, given any function $\omega(1) = f(n,\sigma) = o(\lg\frac{\lg\sigma}{\lg w})$, (i) operations
access and select can be solved in time $O(f(n,\sigma))$, and (ii) rank can be solved in time $O(\lg\frac{\lg\sigma}{\lg w})$.

4.4 High-order Compression

Ferragina and Venturini [11] showed how a string $S[1,n]$ over alphabet $[1,\sigma]$ can be stored within
$nH_k(S) + o(n\lg\sigma)$ bits, for any $k = o(\lg_\sigma n)$, so that it offers constant-time access to any $O(\lg_\sigma n)$
consecutive symbols.

We provide select and rank functionality on top of this representation by adding extra data
structures that take $o(n\lg\sigma)$ bits, whenever $\lg\sigma = \omega(\lg w)$. The technique is similar to those used
by Barbay et al. [2] and Grossi et al. [19]. We divide the text logically into chunks, as before, and
for each chunk we store a monotone minimum perfect hash function (mmphf) $f_a$ for each $a \in [1,\sigma]$.
Each $f_a$ stores the positions where symbol $a$ occurs in the chunk, so that given the position $i$ of
an occurrence of $a$, $f_a(i)$ gives $\mathrm{rank}_a(i)$ within the chunk. All the mmphfs can be stored within
0(o" Iglgo") = o(o"lgo") bits and can be queried in constant time Ill's]. With array X we can know, 
given a, how many symbols smaller than a are there in the chunk. 

Now we have sufficient ingredients to compute $\pi^{-1}$ in constant time: Let $a$ be the $i$th symbol
in the chunk (obtained in constant time using Ferragina and Venturini's structure), then
$\pi^{-1}(i) = f_a(i) + \mathit{select}_0(X, a-1) - (a-1)$. Now we can compute select and rank just as
done in the "constant-time access" branch of Section 4.2. The resulting theorem improves upon Barbay
et al.'s results [2] (they did not use mmphfs).
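
As a rough illustration of the formula for $\pi^{-1}$, the following Python sketch simulates a single
chunk. It rests on assumptions that are not spelled out in this section: that $X$ is the unary
encoding $1^{n_1} 0\, 1^{n_2} 0 \cdots$ of the number of occurrences of each symbol in the chunk, and
that $\pi$ lists the chunk positions stably sorted by symbol. Plain dictionaries stand in for the
mmphfs $f_a$, `select0` is a naive linear scan rather than a constant-time structure, and all
function names are illustrative.

```python
def select0(bits, j):
    """1-based position of the j-th 0 in bits; 0 when j == 0 (naive scan)."""
    if j == 0:
        return 0
    seen = 0
    for pos, b in enumerate(bits, start=1):
        if b == 0:
            seen += 1
            if seen == j:
                return pos
    raise ValueError("fewer than j zeros")


def build_chunk(chunk, sigma):
    """Return (X, f): unary-coded symbol counts and one position->rank map per symbol."""
    X = []
    f = {a: {} for a in range(1, sigma + 1)}
    for a in range(1, sigma + 1):
        for i, c in enumerate(chunk, start=1):
            if c == a:
                f[a][i] = len(f[a]) + 1      # rank_a(i) within the chunk
        X.extend([1] * len(f[a]) + [0])      # n_a ones followed by a zero
    return X, f


def pi_inverse(chunk, X, f, i):
    a = chunk[i - 1]                          # access to the i-th symbol
    smaller = select0(X, a - 1) - (a - 1)     # number of symbols smaller than a
    return f[a][i] + smaller


chunk = [3, 1, 2, 3, 1, 3]                    # symbols drawn from [1, 3]
X, f = build_chunk(chunk, 3)
inv = [pi_inverse(chunk, X, f, i) for i in range(1, len(chunk) + 1)]
print(inv)                                    # [4, 1, 3, 5, 2, 6]
```

Here position 1 holds the first occurrence of symbol 3, which is preceded in sorted order by the
three occurrences of symbols 1 and 2, so it maps to 4.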

Theorem 9 A string $S[1,n]$ over alphabet $[1,\sigma]$, for $\sigma \le n$ and $\lg\sigma = \omega(\lg w)$,
can be represented using $nH_k(S) + o(n\lg\sigma)$ bits for any $k = o(\lg_\sigma n)$ so that, given any
function $\omega(1) = f(n,\sigma) = o\left(\lg\frac{\lg\sigma}{\lg w}\right)$, (i) operation access can be
solved in constant time, (ii) operation select can be solved in time $O(f(n,\sigma))$, and (iii)
operation rank can be solved in time $O\left(\lg\frac{\lg\sigma}{\lg w}\right)$.

To compare with the corresponding result by Grossi et al. [19] (who do use mmphfs) we can fix
the redundancy to $O\left(\frac{n\lg\sigma}{\lg\lg\sigma}\right)$, where they obtain $O(\lg\lg\sigma)$ time
for select and rank, whereas we obtain the same time for select and our improved time for rank, as
long as $\lg\sigma = \Omega(\lg w \lg\lg w \lg\lg\lg w)$.

5 Conclusions 

This paper considerably reduces the gap between upper and lower bounds for sequence representations
providing access, select and rank queries. Most notably, we give matching lower and upper
bounds $\Theta\left(\lg\frac{\lg\sigma}{\lg w}\right)$ for operation rank, which was the least developed
one in terms of lower bounds. The issue of the space related to this complexity is basically solved
as well: we have shown it can be achieved even within compressed space, and it cannot be surpassed
within space $O(n \cdot w^{O(1)})$. On the other hand, operations access and select can be solved,
within the same compressed space, in almost constant time (i.e., as close to $O(1)$ as desired but
not both reaching it, unless we double the space).

There are still some intriguing issues that remain unclear: 

1. Golynski's lower bounds [15] leave open the door to achieving constant time for access and
select simultaneously, with $O(n(\lg\sigma)^2/\lg n)$ bits of redundancy. However, this has not yet
been achieved for the interesting case $\omega(\lg w) = \lg\sigma = o(\lg n)$. We conjecture that this
is not possible and a stronger lower bound holds.

2. While we can achieve constant-time select and almost-constant time for access (or vice versa)
within zero-order entropy space, we can achieve only the second combination within high-order
entropy space. If simultaneous constant-time access and select is not possible, then
no solution for the first combination can build over a compressed representation of S giving
constant-time access, as it has been the norm [2] [1] [19].

3. We have achieved high-order compression with almost-constant access and select times, and
optimal rank time, but only for alphabets of size superpolynomial in $w$. By using one of
Golynski's binary rank/select indexes [14] per symbol over Ferragina and Venturini's
representation [11], we get high-order compression and constant time for all the operations for
any $\sigma = o(\lg n)$. This leaves open the interesting band of alphabet sizes
$\Omega(\lg n) = \sigma = w^{O(1)}$.

A Upper Bound for Predecessor Search 

We detail a data structure that stores a set $S$ of $n$ elements from universe $U = [1,u]$ in
$O(n\lg(u/n))$ bits of space, while supporting predecessor queries in time
$O\left(\lg\frac{\lg u - \lg n}{\lg w}\right)$. We first start with a solution that uses $O(n\lg u)$
bits of space. We use a variant of the traditional recursive van Emde Boas solution [27]. Let
$\ell = \lg u$ (w.l.o.g. assume that $\ell = (\lg w - 1) \cdot 2^i$ for some $i \ge 0$) be the
length of keys. We denote the predecessor data structure that stores a set $S$ of keys of length
$\ell$ by $D^{\ell}(S)$. Given an element $x$, the predecessor data structure should return a pair
$(y,r)$ where $y$ is the predecessor of $x$ in $S$ and $r$ is the rank of $y$ in $S$ (i.e., the
number of elements of $S$ smaller than or equal to $y$). If the key $x$ has no predecessor in $S$
(i.e., it is smaller than all keys in $S$), then it should return the pair $(0,0)$.

We now describe the solution. We partition the set $S$ according to the most significant $\ell/2$
bits. We use two operators: $h(x)$ gives the $\ell/2$ most significant bits of $x$, and $l(x)$ gives
the $\ell/2$ least significant bits of $x$.

Let $S_p$ denote the set that stores every element $x$ of $S$ such that $h(x) = p$ and let
$\hat{S}_p$ denote the set $S_p$ deprived of its minimal and maximal elements. Let $P$ denote the
set that stores all distinct values of $h(x)$ for all $x \in S$. The data structure consists of the
following components:

1. A predecessor data structure $D^{\ell/2}(P)$.

2. A predecessor data structure $D^{\ell/2}(\hat{S}_p)$ for every $p \in P$ such that $\hat{S}_p$ is
non-empty.

3. A dictionary $I(P)$ (with constant time and linear space) that stores the set $P$. For each
element $p \in P$, the dictionary returns the tuple $(m, r_m, M, r_M, q)$ with $m$ (respectively $M$)
being the smallest (respectively largest) element in $S_p$, $r_m$ (respectively $r_M$) being the rank
of $m$ (respectively $M$) in $S$, and $q$ a pointer to $D^{\ell/2}(\hat{S}_p)$.
We have described the recursive data structure. The base case is a predecessor data structure
$D^{\lg w - 1}(S)$ for a set $S$ of size $t$. Note that the set $S$ is a subset of
$U = [1, 2^{\lg w - 1}] = [1, w/2]$. This structure is technical and is described in Section A.1.
It encodes $S$ using $O(t\lg|U|) = O(t\lg w)$ bits and answers predecessor queries in constant time.

We now get back to the main data structure and describe how queries are done on it. Given a
key $x$, we first query $I(P)$ for the key $p = h(x)$. Now, depending on the result, we have two cases:

1. The dictionary does not find $p$. Then we query $D^{\ell/2}(P)$ for the key $p$. This returns a
pair $(y,r)$. We now search $I(P)$ for $y$, which returns a tuple $(m, r_m, M, r_M, q)$, and the
final answer is $(M, r_M)$.

2. The dictionary finds $p$ and returns a tuple $(m, r_m, M, r_M, q)$. We have the following subcases:

(a) We have $x < m$. Then we query $D^{\ell/2}(P)$ for the key $p$. This returns a pair $(y,r)$. We
now search $I(P)$ for $y$, which returns a tuple $(m, r_m, M, r_M, q)$, and the final answer is
$(M, r_M)$.

(b) We have $x = m$, then the answer is $(m, r_m)$.

(c) We have $x \ge M$, then the answer is $(M, r_M)$.

(d) We have $m < x < M$. Then we query $D^{\ell/2}(\hat{S}_p)$ (pointed to by $q$) for the key
$l(x)$. This returns a tuple $(y,r)$. The final answer is $(2^{\ell/2} p + y, r_m + r)$ if
$(y,r) \neq (0,0)$ and $(m, r_m)$ otherwise.
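
The recursive structure and its query cases can be sketched in Python as follows. This is only a
sketch under several assumptions of ours: keys are integers in $[0, 2^{\ell})$ with $\ell$ equal to
`base_bits` times a power of two, a Python `dict` stands in for the dictionary $I(P)$, and the base
case is answered by binary search over a sorted list instead of the word-packed structure of
Section A.1. In case 2(a) the sketch queries $D^{\ell/2}(P)$ for $p - 1$, which is our reading of
the step, so that the returned element of $P$ is strictly smaller than $p$; it also returns
$(0,0)$ when $P$ has no such element. The class and attribute names are hypothetical.

```python
from bisect import bisect_right


class Pred:
    """Sketch of D^bits(S): predecessor search over keys in [0, 2**bits)."""

    def __init__(self, keys, bits, base_bits=4):
        self.bits = bits
        self.keys = sorted(set(keys))
        self.is_base = bits <= base_bits
        if self.is_base:
            return                           # base case: binary search at query time
        half = bits // 2
        groups = {}                          # p -> sorted keys x with h(x) = p
        for x in self.keys:
            groups.setdefault(x >> half, []).append(x)
        self.top = Pred(groups.keys(), half, base_bits)        # D^{bits/2}(P)
        self.table = {}                      # stands in for the dictionary I(P)
        rank = 0
        for p, g in groups.items():          # ascending p, since keys are sorted
            inner = [x & ((1 << half) - 1) for x in g[1:-1]]    # drop min and max
            sub = Pred(inner, half, base_bits) if inner else None
            self.table[p] = (g[0], rank + 1, g[-1], rank + len(g), sub)
            rank += len(g)

    def query(self, x):
        """Return (predecessor of x, its rank), or (0, 0) if there is none."""
        if self.is_base:
            r = bisect_right(self.keys, x)
            return (self.keys[r - 1], r) if r else (0, 0)
        half = self.bits // 2
        p, low = x >> half, x & ((1 << half) - 1)
        entry = self.table.get(p)
        if entry is None or x < entry[0]:    # cases 1 and 2(a)
            target = p if entry is None else p - 1
            if target < 0:
                return (0, 0)
            y, r = self.top.query(target)
            if r == 0:
                return (0, 0)                # nothing in P below p
            _, _, M, r_M, _ = self.table[y]
            return (M, r_M)
        m, r_m, M, r_M, sub = entry
        if x == m:                           # case 2(b)
            return (m, r_m)
        if x >= M:                           # case 2(c)
            return (M, r_M)
        if sub is None:                      # case 2(d) with an empty inner set
            return (m, r_m)
        y, r = sub.query(low)                # case 2(d)
        return ((p << half) | y, r_m + r) if r else (m, r_m)


keys = [3, 17, 18, 40, 41, 42, 200, 255, 256, 1000, 65535]
D = Pred(keys, bits=16)
print(D.query(600))                          # (256, 9)
```

A rank of 0 is what signals "no predecessor" in the sketch, so a stored key equal to 0 in a
substructure is not confused with the pair $(0,0)$.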

Space analysis. The space can be proved to be $O(n\ell)$ bits by induction. For the base case we
have that $t$ keys are encoded using $O(t\lg w)$ bits. Now for any recursive data structure
$D^{\ell}(S)$ we notice that the substructures $D^{\ell/2}(\cdot)$ will store at most $n$ elements in
total. The reason is that the structure $D^{\ell/2}(P)$ will store $|P|$ elements, but the number of
elements stored in the data structures $D^{\ell/2}(\hat{S}_p)$ is at most $n - |P|$ (since the
maximal and minimal elements in each $S_p$ are not stored in those substructures). For the
non-recursive part we only have the dictionary $I(P)$, which uses $O(n\ell)$ bits of space. We
denote by $s(\ell,n)$ the space usage of $D^{\ell}(S)$ with $|S| = n$. Then the space usage follows
the recurrence $s(\ell,n) = s(\ell/2,n) + O(n\ell)$. The solution to this recurrence is $O(n\ell)$.
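
For intuition, the recurrence can be unrolled as below. This is a sketch that writes the $O(n\ell)$
term as $c\,n\ell$ for a hypothetical constant $c$, lets $k$ denote the number of levels above the
base case, and uses the fact argued above that each level holds at most $n$ elements, so the base
level costs $O(n\lg w)$ bits:

\[
s(\ell,n) \;\le\; \sum_{i=0}^{k-1} c\,n\,\frac{\ell}{2^i} + O(n\lg w)
\;\le\; 2c\,n\ell + O(n\lg w) \;=\; O(n\ell),
\]

where the last step holds because $\ell \ge \lg w - 1$.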

In addition we have that each structure $D^{\ell/2^i}$ is pointed to by a pointer $q$. As there are
$O(n)$ structures $D$, all the pointers add up to $O(n\lg n)$ bits. This gives in total $O(n\lg u)$
bits.$^1$

Time analysis. We query the data structures $D^{\ell/2^i}(\cdot)$ for $i = 0, 1, \ldots$ until
$\ell/2^i = \lg w - 1$ (we may stop the recursion before reaching this point). For each recursive
step we spend constant time querying the dictionary. Thus the global query time is upper bounded by
$O\left(\lg\frac{\ell}{\lg w}\right)$.
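
The number of recursive levels behind this bound can be made explicit. Writing $k$ for the level at
which the base case is reached (our notation), and assuming $w \ge 4$ so that
$\lg w - 1 \ge (\lg w)/2$:

\[
\frac{\ell}{2^{k}} = \lg w - 1
\quad\Longrightarrow\quad
k = \lg\frac{\ell}{\lg w - 1} \le \lg\frac{\ell}{\lg w} + 1 .
\]

Each of the $k$ levels costs constant time, and so does the base case.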

A.1 Predecessor Queries on Short Keys

We now describe the base case of the recursion for $O(\lg w)$-bit keys. Suppose that we have a
set $S$ of $t$ keys, each of length $\ell = (\lg w)/2 - 1$. Clearly $t \le \sqrt{w}/2$. What we want
is to do predecessor search for any $x$ over the set $S$. For that we first sort the keys (in
ascending order) obtaining an array $A[1,t]$. Then we pack them in a block $B$ of $t(\ell+1)$
consecutive bits (this uses $t\lg w/2 \le \sqrt{w}\lg w/2 \le w$ bits, which is less than one word)
where each key is separated from the other by a zero bit. That is, we store the element $A[i]$ in
the bits $B[(i-1)(\ell+1)+1, i(\ell+1)-1]$ and store a zero at bit $B[i(\ell+1)]$.

$^1$ Plus $O(n\lg w)$, which is dominated by $O(n\lg u)$ unless $u < w$, in which case we can simply
use the technique of Section A.1, simulating a shorter machine word.

We now show how to do a predecessor query for a key $x$ on $S$ in constant time. This is done in
the following steps:

1. We first duplicate the key $x$, $t$ times, and set the separator bits. That is, we compute
$X = (x \cdot (0^{\ell}1)^t) \;\mathrm{OR}\; (10^{\ell})^t$.

2. We subtract $B$ from $X$, obtaining $Y = X - B$. This does in parallel the computation of
$x - A[i]$ for all $1 \le i \le t$, and the result of each subtraction (negative or nonnegative) is
stored in the separator bit $Y[i(\ell+1)]$.

3. We mask all but separator bits. That is, we compute $Z = Y \;\mathrm{AND}\; (10^{\ell})^t$.

4. We finally determine the rank of $x$. If $Z = (10^{\ell})^t$ then we answer $r = t$. Otherwise we
use the mmphf technique described at the end of Section 4.1.3. Here the space for the mmphf is even
less, $O(\sqrt{w}\lg w) = o(w)$ bits.
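
The four steps above can be tried out with Python integers playing the role of a machine word. The
sketch below makes a few simplifying assumptions: bits are numbered from the least significant end
(so the separator is the top bit of each $(\ell+1)$-bit field), the constants $(0^{\ell}1)^t$ and
$(10^{\ell})^t$ are rebuilt on every call rather than precomputed, and a population count replaces
the mmphf lookup of step 4. Function names are illustrative.

```python
def pack(sorted_keys, ell):
    """Pack the keys into fields of ell+1 bits; the top bit of each field stays 0."""
    B = 0
    for i, a in enumerate(sorted_keys):
        assert 0 <= a < (1 << ell)
        B |= a << (i * (ell + 1))
    return B


def packed_rank(B, t, ell, x):
    """Number of packed keys <= x, computed with one subtraction and one mask."""
    field = ell + 1
    ones = sum(1 << (i * field) for i in range(t))   # (0^ell 1)^t
    seps = ones << ell                               # (1 0^ell)^t
    X = (x * ones) | seps        # step 1: replicate x and set the separator bits
    Y = X - B                    # step 2: field-wise x - A[i]; no borrow crosses fields
    Z = Y & seps                 # step 3: keep only the separator bits
    return bin(Z).count("1")     # step 4: popcount stands in for the mmphf lookup


A = [2, 5, 9, 12]                # sorted keys of ell = 4 bits each
B = pack(A, 4)
print(packed_rank(B, len(A), 4, 8))   # 2
```

A separator bit survives the subtraction exactly when $x \ge A[i]$, and since the keys are sorted
the surviving bits form a prefix, which is why counting them gives the rank.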

A.2 Reducing Space Usage

We now describe how the space usage can be improved to $O(n(\lg u - \lg n))$. For this we use a
standard idea. We partition the set $S$ into $n' = 2^{\lfloor \lg n \rfloor}$ partitions using the
$\lg n'$ most significant bits. For all the keys in a partition $S_p$, we have that the $\lg n'$
most significant bits are equal to $p$. Let $\hat{S}_p$ denote the set that contains the elements of
$S_p$ truncated to their $\lg u - \lg n'$ least significant bits. We now build an independent
predecessor data structure $D^{\lg u - \lg n'}(\hat{S}_p)$. Each such data structure occupies space
at most $c(|S_p|(\lg u - \lg n'))$, for some fixed constant $c$. We compact all those data structures
in a memory area $A$ of $cn$ cells of $\lg u - \lg n'$ bits.

A bitvector $B$ stores the size of the predecessor data structures. That is, for each
$p \in [1,n']$ we append to $B$ as many 1s as the number of elements inside $S_p$, followed by a 0.
Then, to compute the predecessor of a key $x$ in $S$, we first compute $p = h(x)$. We also compute
$r_0 = \mathit{select}_0(B,p) - p$. This will compute the number of elements in $S_q$ for all
$q < p$. Then we query $D^{\lg u - \lg n'}(\hat{S}_p)$ (the bit pointer to this data structure is
$A[c \cdot r_0(\lg u - \lg n')]$) for the key $l(x)$, which returns an answer $(y,r)$. We now have
three cases:

1. If p = 0, then the final answer is (0,0). 

2. Otherwise, if the returned answer is $(0,0)$, then we query $D^{\lg u - \lg n'}(\hat{S}_{p-1})$
for the key $1^{\lg u - \lg n'}$, which returns a pair $(y,r)$ and the final answer is
$((p-1)n' + y, r_0)$.

3. Otherwise, the final answer is just $(pn' + y, r_0 + r)$.
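
The counting role of the bitvector $B$ can be checked with a small Python sketch. It assumes, for
convenience, that partitions are numbered from 0 (so $p$ is simply the value of the most significant
bits) and that $\mathit{select}_0(B,0) = 0$; with these conventions $\mathit{select}_0(B,p) - p$
counts the elements in all partitions before $p$. The scan-based `select0` and the function names
are placeholders for a constant-time implementation.

```python
def build_B(keys, ubits, top_bits):
    """Unary-code the partition sizes: |S_p| ones followed by a zero, for each p."""
    low_bits = ubits - top_bits
    sizes = [0] * (1 << top_bits)
    for x in keys:
        sizes[x >> low_bits] += 1            # p = top_bits most significant bits of x
    B = []
    for s in sizes:
        B.extend([1] * s + [0])
    return B


def select0(B, j):
    """1-based position of the j-th 0 in B; 0 when j == 0 (naive scan)."""
    if j == 0:
        return 0
    seen = 0
    for pos, b in enumerate(B, start=1):
        if b == 0:
            seen += 1
            if seen == j:
                return pos
    raise ValueError("fewer than j zeros")


def elements_before(B, p):
    """r_0: number of keys stored in partitions 0 .. p-1."""
    return select0(B, p) - p


keys = [1, 2, 9, 12, 13, 15]                 # 4-bit keys, 2 top bits -> 4 partitions
B = build_B(keys, ubits=4, top_bits=2)
print([elements_before(B, p) for p in range(4)])   # [0, 2, 2, 3]
```

In the structure itself, $r_0$ also locates the substructure of partition $p$ inside the memory
area $A$, since the substructures are laid out in partition order.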

It is easy to see that the data structure occupies $O(n(\lg u - \lg n))$ bits and it answers queries
in time $O\left(\lg\frac{\lg u - \lg n}{\lg w}\right)$. We thus have proved the following theorem:

Theorem 10 Given a set $S$ of $n$ keys over universe $[1,u]$, there is a data structure that occupies
$O(n\lg(u/n))$ bits of space and answers predecessor queries in time
$O\left(\lg\frac{\lg u - \lg n}{\lg w}\right)$.